Models for bankruptcy prediction are useful in several real-world scenarios, and multiple research contributions have addressed the task based on structured (numerical) as well as unstructured (textual) data. However, the lack of common benchmark datasets and evaluation strategies impedes objective comparison between models. This paper introduces such a benchmark for the unstructured-data scenario, based on novel and established datasets, in order to stimulate further research on the task. We describe and evaluate several classical and neural baseline models and discuss the benefits and flaws of the different strategies. In particular, we find that a lightweight bag-of-words model based on static in-domain word representations obtains surprisingly good results, especially when taking textual data from several years into account. These results are critically evaluated and discussed with respect to particular aspects of the data and the task. All code to replicate the data and experimental results will be published.
Mixup is a popular data augmentation technique for training deep neural networks where additional samples are generated by linearly interpolating pairs of inputs and their labels. This technique is known to improve the generalization performance in many learning paradigms and applications. In this work, we first analyze Mixup and show that it implicitly regularizes infinitely many directional derivatives of all orders. We then propose a new method to improve Mixup based on this insight. To demonstrate the effectiveness of the proposed method, we conduct experiments across various domains such as images, tabular data, speech, and graphs. Our results show that the proposed method improves Mixup across various datasets using a variety of architectures, for instance, exhibiting an improvement over Mixup by 0.8% in ImageNet top-1 accuracy.
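The interpolation at the core of Mixup is easy to sketch. A minimal NumPy version, assuming one-hot labels and the standard Beta(alpha, alpha) sampling of the mixing coefficient (function and variable names here are illustrative, not from the paper):

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2, rng=None):
    """Linearly interpolate a pair of inputs and their one-hot labels.

    The mixing coefficient lam is drawn from Beta(alpha, alpha),
    as in the original Mixup recipe.
    """
    rng = rng or np.random.default_rng(0)
    lam = rng.beta(alpha, alpha)
    x = lam * x1 + (1.0 - lam) * x2
    y = lam * y1 + (1.0 - lam) * y2
    return x, y, lam

# Usage: mix two toy samples with one-hot labels.
x1, y1 = np.array([1.0, 0.0]), np.array([1.0, 0.0])
x2, y2 = np.array([0.0, 1.0]), np.array([0.0, 1.0])
x, y, lam = mixup(x1, y1, x2, y2)
```

Because labels are interpolated with the same coefficient as the inputs, the mixed label remains a valid probability distribution.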
Analogical proportions compare pairs of items (a, b) and (c, d) in terms of their differences and similarities. They play a key role in the formalization of analogical inference. The paper first discusses how to improve analogical inference in terms of accuracy and in terms of computational cost. Then it indicates the potential of analogical proportions for explanation. Finally, it highlights the close relationship between analogical proportions and multi-valued dependencies, which reveals an unsuspected aspect of the former.
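For Boolean vectors, analogical proportions and the associated inference step can be made concrete. The sketch below uses the standard componentwise definition (a : b :: c : d holds when a differs from b exactly as c differs from d); the helper name `solve` is illustrative, not from the paper:

```python
def proportion(a, b, c, d):
    """Componentwise Boolean analogical proportion a : b :: c : d.

    Holds when a differs from b exactly as c differs from d:
    (a and not b) == (c and not d) and (not a and b) == (not c and d).
    """
    return all(
        ((ai and not bi) == (ci and not di)) and
        ((not ai and bi) == (not ci and di))
        for ai, bi, ci, di in zip(a, b, c, d)
    )

def solve(a, b, c):
    """Analogical inference: find d such that a : b :: c : d, if it exists.

    Per component the equation is solvable iff ai == bi or ai == ci;
    then di = ci if ai == bi, else di = bi.
    """
    d = []
    for ai, bi, ci in zip(a, b, c):
        if ai == bi:
            d.append(ci)
        elif ai == ci:
            d.append(bi)
        else:
            return None  # no solution for this component
    return d

d = solve([1, 0], [0, 0], [1, 1])  # -> [0, 1]
```

The inference step returns `None` when no component-wise solution exists, which is the formal counterpart of an analogy failing to transfer.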
A wide variety of model explanation approaches have been proposed in recent years, all guided by very different rationales and heuristics. In this paper, we take a new route and cast interpretability as a statistical inference problem. We propose a general deep probabilistic model designed to produce interpretable predictions. The model parameters can be learned via maximum likelihood, and the method can be adapted to any predictor network architecture and any type of prediction problem. Our method is an instance of amortized interpretability models, where a neural network is used as a selector to allow for fast interpretation at inference time. Several popular interpretability methods are shown to be particular cases of regularised maximum likelihood for our general model. We propose new datasets with ground-truth selection which allow for the evaluation of feature importance maps. Using these datasets, we show experimentally that using multiple imputation provides more reasonable interpretations.
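The closing observation, that multiple imputation yields more reasonable interpretations, can be illustrated with a toy attribution scheme: instead of masking a feature with a fixed value, replace it with several draws from its empirical marginal and average the resulting change in the model output. This is only a sketch in the spirit of that argument, not the paper's model; all names are illustrative:

```python
import numpy as np

def importance_by_imputation(model, X, n_draws=20, rng=None):
    """Score each feature by replacing it with values imputed from its
    empirical marginal (resampled from other rows) and measuring the
    average change in model output. Multiple draws average out the
    noise of any single imputation."""
    rng = rng or np.random.default_rng(0)
    base = model(X)
    n, d = X.shape
    scores = np.zeros(d)
    for j in range(d):
        deltas = []
        for _ in range(n_draws):
            Xm = X.copy()
            Xm[:, j] = rng.choice(X[:, j], size=n, replace=True)
            deltas.append(np.abs(model(Xm) - base).mean())
        scores[j] = np.mean(deltas)
    return scores

# Toy model that depends only on feature 0.
model = lambda X: X[:, 0] * 2.0
X = np.random.default_rng(1).normal(size=(200, 3))
scores = importance_by_imputation(model, X)
```

Imputing from the marginal keeps the replacement values on the data distribution, which is the key difference from masking with an arbitrary constant such as zero.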
Brain tumor imaging has been part of the clinical routine for many years to perform non-invasive detection and grading of tumors. Tumor segmentation is a crucial step for managing primary brain tumors because it allows a volumetric analysis to have a longitudinal follow-up of tumor growth or shrinkage to monitor disease progression and therapy response. In addition, it facilitates further quantitative analysis such as radiomics. Deep learning models, in particular CNNs, have been a methodology of choice in many applications of medical image analysis including brain tumor segmentation. In this study, we investigated the main design aspects of CNN models for the specific task of MRI-based brain tumor segmentation. Two commonly used CNN architectures (i.e. DeepMedic and U-Net) were used to evaluate the impact of the essential parameters such as learning rate, batch size, loss function, and optimizer. The performance of CNN models using different configurations was assessed with the BraTS 2018 dataset to determine the most performant model. Then, the generalization ability of the model was assessed using our in-house dataset. For all experiments, U-Net achieved a higher DSC compared to DeepMedic. However, the difference was only statistically significant for whole tumor segmentation using FLAIR sequence data and tumor core segmentation using T1w sequence data. Adam and SGD, both with the initial learning rate set to 0.001, provided the highest segmentation DSC when training the CNN model using U-Net and DeepMedic architectures, respectively. No significant difference was observed when using different normalization approaches. In terms of loss functions, a weighted combination of soft Dice and cross-entropy loss with the weighting term set to 0.5 resulted in an improved segmentation performance and training stability for both DeepMedic and U-Net models.
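The loss described in the last sentence, a weighted combination of soft Dice and cross-entropy with the weighting term set to 0.5, can be sketched for the binary case (the study's exact multi-class, per-region formulation may differ):

```python
import numpy as np

def soft_dice_loss(p, t, eps=1e-6):
    """1 - soft Dice coefficient between predicted probabilities p
    and a binary target t; eps guards against empty masks."""
    inter = (p * t).sum()
    return 1.0 - (2.0 * inter + eps) / (p.sum() + t.sum() + eps)

def bce_loss(p, t, eps=1e-7):
    """Binary cross-entropy, with probabilities clipped for stability."""
    p = np.clip(p, eps, 1.0 - eps)
    return -(t * np.log(p) + (1 - t) * np.log(1 - p)).mean()

def combined_loss(p, t, w=0.5):
    """Weighted combination as in the study: w * Dice + (1 - w) * CE."""
    return w * soft_dice_loss(p, t) + (1.0 - w) * bce_loss(p, t)
```

The Dice term directly optimizes overlap (robust to class imbalance between tumor and background voxels), while the cross-entropy term provides smooth per-voxel gradients; the 0.5 weighting balances the two.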
Gaussian process training decomposes into inference of the (approximate) posterior and learning of the hyperparameters. For non-Gaussian (non-conjugate) likelihoods, two common choices for approximate inference are Expectation Propagation (EP) and Variational Inference (VI), which have complementary strengths and weaknesses. While VI's lower bound to the marginal likelihood is a suitable objective for inferring the approximate posterior, this does not automatically make it a good learning objective for hyperparameter optimization. We design a hybrid training procedure where the inference leverages conjugate-computation VI and the learning uses an EP-like marginal likelihood approximation. We empirically demonstrate on binary classification that this provides a good learning objective and generalizes better.
Clustering results of high-dimensional data are typically interpreted via post-processing involving dimensionality reduction and subsequent visualization. This corrupts the meaning of the data and obfuscates the interpretation. We propose algorithm-agnostic interpretation methods to explain clustering results in reduced dimensions while preserving the integrity of the data. Permutation feature importance for clustering represents a general framework based on shuffling feature values and measuring the resulting changes in cluster assignments via a custom score function. Individual conditional expectation for clustering indicates observation-wise changes in cluster assignments due to changes in the data. Partial dependence for clustering evaluates average changes in cluster assignments across the entire feature space. All methods can be used with any clustering algorithm able to reassign instances through soft labels. In contrast to common post-processing methods such as principal component analysis, the introduced methods maintain the original structure of the features.
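Permutation feature importance for clustering, as described above, can be sketched with a nearest-centroid assignment and the fraction of changed assignments as a simple score function. The paper's framework allows arbitrary score functions and soft labels; this hard-label toy is only illustrative:

```python
import numpy as np

def assign(X, centroids):
    """Hard cluster assignment: index of the nearest centroid."""
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    return d.argmin(1)

def permutation_importance(X, centroids, n_repeats=10, rng=None):
    """Per-feature importance: fraction of observations whose cluster
    assignment changes when that feature's values are shuffled."""
    rng = rng or np.random.default_rng(0)
    base = assign(X, centroids)
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        changed = []
        for _ in range(n_repeats):
            Xs = X.copy()
            Xs[:, j] = rng.permutation(Xs[:, j])
            changed.append((assign(Xs, centroids) != base).mean())
        scores[j] = np.mean(changed)
    return scores

# Two clusters separated only along feature 0; feature 1 is noise.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal([0, 0], 0.1, (50, 2)),
               rng.normal([5, 0], 0.1, (50, 2))])
centroids = np.array([[0.0, 0.0], [5.0, 0.0]])
scores = permutation_importance(X, centroids)
```

Because the method only shuffles columns and re-scores assignments, it works directly in the original feature space, which is the contrast with PCA-style post-processing drawn in the abstract.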
Evolutionary algorithms typically explore the search space of solutions by means of crossover and mutation. While a mutation consists of a small, local modification of a solution, crossover mixes the genetic information of two solutions to compute a new one. For model-driven optimization (MDO), where models directly serve as possible solutions (instead of first being converted into another representation), only a generic crossover operator has been developed so far. Taking graphs as the formal foundation for models, we further refine this operator in such a way that additional well-formedness constraints are preserved: we prove that, given two models satisfying a given set of multiplicity constraints as input, our refined crossover operator computes two new models as output that also satisfy that set of constraints.
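The preservation guarantee described above can be illustrated with a deliberately simplified toy: if a model is represented as a map from objects to reference lists and crossover always exchanges complete per-object lists between two constraint-satisfying parents, then the multiplicity constraint holds in the offspring by construction. This is not the paper's refined graph-based operator, only an illustration of the preservation argument:

```python
import random

MULT = (1, 3)  # each object must reference between 1 and 3 targets

def satisfies(model, mult=MULT):
    """Check the multiplicity constraint on every object's reference list."""
    lo, hi = mult
    return all(lo <= len(refs) <= hi for refs in model.values())

def crossover(parent_a, parent_b, rng=None):
    """Toy crossover: each child inherits every object's complete reference
    list from one parent or the other. Since each per-object list comes
    intact from a constraint-satisfying parent, multiplicities are
    preserved by construction."""
    rng = rng or random.Random(0)
    child_a, child_b = {}, {}
    for obj in parent_a:
        if rng.random() < 0.5:
            child_a[obj], child_b[obj] = parent_a[obj], parent_b[obj]
        else:
            child_a[obj], child_b[obj] = parent_b[obj], parent_a[obj]
    return child_a, child_b

pa = {"A": ["x"], "B": ["x", "y"], "C": ["x", "y", "z"]}
pb = {"A": ["y", "z"], "B": ["z"], "C": ["y"]}
ca, cb = crossover(pa, pb)
```

The paper's contribution is proving an analogous preservation property for a much richer setting, where models are graphs and the crossover may recombine below the level of whole objects.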
Source-free domain adaptation (SFDA) aims to adapt a classifier to an unlabelled target dataset using only a pre-trained source model. However, the absence of the source data and the domain shift make the predictions on the target data unreliable. We propose quantifying the uncertainty in the source model's predictions and utilizing it to guide the target adaptation. For this purpose, we construct a probabilistic source model by incorporating priors on the network parameters, inducing a distribution over the model predictions. Uncertainties are estimated by employing a Laplace approximation and incorporated to identify target data points that do not lie on the source manifold and to down-weight them when maximizing the mutual information on the target data. Unlike recent works, our probabilistic treatment is computationally lightweight, decouples source training from target adaptation, and requires no specialized source training or changes to the model architecture. We show the advantages of uncertainty-guided SFDA over traditional SFDA in both closed-set and open-set settings, and provide empirical evidence that our approach is more robust to strong domain shifts even without tuning.
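The Laplace-approximation idea can be illustrated on a 1-D logistic-regression toy: compute the MAP weight and use the inverse Hessian of the negative log posterior as the posterior variance, then down-weight points with a large logit-space predictive spread. The paper applies this to deep source models; the specific `weight` form below is an assumption for illustration only:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def laplace_logreg(x, y, prior_var=1.0, steps=50):
    """MAP weight of a 1-D logistic regression with a Gaussian prior, plus
    the Laplace posterior variance (inverse Hessian of the negative log
    posterior at the MAP), found by Newton's method."""
    w = 0.0
    for _ in range(steps):
        p = sigmoid(w * x)
        grad = ((p - y) * x).sum() + w / prior_var
        hess = (p * (1 - p) * x * x).sum() + 1.0 / prior_var
        w -= grad / hess
    p = sigmoid(w * x)
    var = 1.0 / ((p * (1 - p) * x * x).sum() + 1.0 / prior_var)
    return w, var

def weight(x_new, var):
    """Illustrative down-weighting: the logit-space predictive std is
    |x| * sqrt(var), so points far from the training data get smaller
    weights (an assumption, not the paper's exact rule)."""
    return 1.0 / (1.0 + np.abs(x_new) * np.sqrt(var))

# Tiny 'source' dataset: negatives left of zero, positives right.
x = np.array([-2.0, -1.0, 1.0, 2.0])
y = np.array([0.0, 0.0, 1.0, 1.0])
w, var = laplace_logreg(x, y)
```

The key property carried over to the SFDA setting is that the Laplace step is post hoc: it reuses the trained (MAP) model and only adds a curvature computation, which is why the treatment stays lightweight and decoupled from source training.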
TensorFlow GNN (TF-GNN) is a scalable library for graph neural networks in TensorFlow. It is designed from the bottom up to support the rich, heterogeneous graph data that occurs in today's information ecosystems. Many production models at Google use TF-GNN, and it has recently been released as an open-source project. In this paper, we describe the TF-GNN data model, its Keras modeling API, and relevant capabilities such as graph sampling, distributed training, and accelerator support.